A Multilevel Parallelization Framework for High-Order Stencil Computations

Authors

  • Hikmet Dursun
  • Ken-ichi Nomura
  • Liu Peng
  • Richard Seymour
  • Weiqiang Wang
  • Rajiv K. Kalia
  • Aiichiro Nakano
  • Priya Vashishta
Abstract

Stencil-based computation on structured grids is a common kernel in a broad range of scientific applications. The order of a stencil increases with the required precision, and optimizing such high-order stencils on multicore architectures is a challenge. Here, we propose a multilevel parallelization framework that combines: (1) inter-node parallelism by spatial decomposition; (2) intra-chip parallelism through multithreading; and (3) data-level parallelism via single-instruction multiple-data (SIMD) techniques. The framework is applied to a sixth-order stencil-based seismic wave propagation code on a suite of multicore architectures. Strong-scaling tests exhibit superlinear speedup due to increasing cache capacity on Intel Harpertown and AMD Barcelona based clusters, whereas weak-scaling parallel efficiency is 0.92 on 65,536 BlueGene/P processors. Multithreading+SIMD optimizations achieve a 7.85-fold speedup on a dual quad-core Intel Clovertown, and the data-level parallel efficiency is found to depend on the stencil order.
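To make the terms in the abstract concrete, the following is a minimal sketch of a sixth-order stencil and of level (1), spatial decomposition. It is not the authors' code: it assumes a 1D second-derivative stencil with the standard sixth-order central-difference weights, and the function names (`laplacian_1d`, `laplacian_decomposed`) and the sequential loop over subdomains are hypothetical stand-ins for what would run on separate nodes.

```python
# Sixth-order central-difference weights for the second derivative
# (standard finite-difference coefficients for offsets -3..+3).
COEFFS = [1/90, -3/20, 3/2, -49/18, 3/2, -3/20, 1/90]
HALO = 3  # stencil radius: neighbors needed on each side of a point


def laplacian_1d(u, h):
    """Apply the stencil to the interior of u; returns len(u) - 2*HALO values."""
    out = []
    for i in range(HALO, len(u) - HALO):
        acc = 0.0
        for k, c in enumerate(COEFFS):
            acc += c * u[i + k - HALO]
        out.append(acc / (h * h))
    return out


def laplacian_decomposed(u, h, parts):
    """Assumed illustration of spatial decomposition: split the interior into
    `parts` contiguous subdomains, each padded with HALO ghost points that a
    real distributed code would receive from neighboring nodes."""
    n_interior = len(u) - 2 * HALO
    out = []
    for p in range(parts):  # each iteration stands in for one node
        lo = p * n_interior // parts
        hi = (p + 1) * n_interior // parts
        local = u[lo:hi + 2 * HALO]  # owned points plus ghost layers
        out.extend(laplacian_1d(local, h))
    return out


h = 0.1
u = [(i * h) ** 2 for i in range(40)]  # f(x) = x^2, so f''(x) = 2
full = laplacian_1d(u, h)
split = laplacian_decomposed(u, h, parts=4)
```

The ghost layer width equals the stencil radius, which is why a higher stencil order raises the amount of boundary data each subdomain needs from its neighbors.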


Related resources

PATUS: A Code Generation and Auto-Tuning Framework For Parallel Stencil Computations

PATUS is a code generation and auto-tuning framework for stencil computations targeted at modern multi- and many-core processors, such as multicore CPUs and graphics processing units. Its ultimate goals are to provide a means towards productivity and performance on current and future multi- and many-core platforms. The framework generates the code for a compute kernel from a specification of the st...


Deriving Stencil Hardware Accelerators from a Single Higher-Order Function

Stencil computations are array based algorithms that apply a computation to all array elements in a fixed regular pattern and can be found in many scientific and engineering applications. Parallelization of these applications becomes more and more important in order to keep up with the demand for computing power. FPGAs offer a lot of computing power but are considered hard to program. In this p...


Using a Dynamic Schedule to Increase the Performance of Tiling in Stencil Computations

A stencil computation determines the values of points in a grid of some dimensionality by repeatedly evaluating a given function of a grid point and its neighbors. The parallelization and optimization of stencil computations are the subject of ongoing research. The most prevalent approach is the subdivision of the iteration domain into smaller pieces, called tiles. We give an overview of a method t...


Parallelization Framework for Scientific Application Kernels on Multi-Core/Many-Core Platforms, by Liu Peng. A dissertation presented to the Faculty of the USC Graduate School, University of Southern California

ion to allow reasoning about their behavior across a broad range of applications. Programs that are members of a particular class can be implemented differently and the underlying numerical methods may change over time, but the claim is that the underlying patterns have persisted through generations of changes and will remain important into the future. The seven dwarfs defined by Phil Colella...


In-Core Optimization of High-Order Stencil Computations

In this paper, we apply in-core optimization techniques to high-order stencil computations, including: (1) cache blocking for efficient L2 cache use; (2) register blocking and data-level parallelism via single-instruction multiple-data (SIMD) techniques to increase L1 cache efficiency; and (3) software prefetching techniques. Our generic approach is tested with a kernel extracted from a 6th-or...




Publication date: 2009